Integrating a Discriminative Classifier into Phrase-based and Hierarchical Decoding
Authors
Abstract
Current state-of-the-art statistical machine translation (SMT) relies on simple feature functions which make independence assumptions at the level of phrases or hierarchical rules. However, it is well known that discriminative models can benefit from rich features extracted from the source sentence context outside of the applied phrase or hierarchical rule, which is available at decoding time. We present a framework for the open-source decoder Moses that allows discriminative models over source context to be trained easily on a large number of examples and then included as feature functions in decoding.
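As an illustration of the idea in the abstract, the following minimal sketch shows how a classifier over source context could contribute one extra feature to a log-linear translation score. It is not the Moses API or the authors' implementation: the function names, the window-based features, the softmax classifier, and all weights and example values are hypothetical and chosen only to make the example runnable.

```python
import math

def context_features(source, start, end, window=3):
    """Hypothetical features: words just outside the source span [start, end)."""
    left = source[max(0, start - window):start]
    right = source[end:end + window]
    return [f"L:{w}" for w in left] + [f"R:{w}" for w in right]

def classifier_log_prob(weights, source, start, end, candidates, chosen):
    """log P(chosen | source context) under a softmax over candidate translations."""
    feats = context_features(source, start, end)
    scores = {c: sum(weights.get((f, c), 0.0) for f in feats) for c in candidates}
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return scores[chosen] - log_z

def loglinear_score(feature_values, lambdas):
    """Weighted sum of feature values, as in a log-linear SMT model."""
    return sum(lambdas[name] * feature_values[name] for name in feature_values)

# Toy example: translating "bank" (positions 4..5), with "river" in the right context.
source = "he sat on the bank of the river".split()
weights = {("R:river", "Ufer"): 2.0}   # assumed, hand-set classifier weight
candidates = ["Ufer", "Bank"]
lambdas = {"phrase_tm": 1.0, "context_clf": 0.5}  # assumed feature weights

lp_ufer = classifier_log_prob(weights, source, 4, 5, candidates, "Ufer")
lp_bank = classifier_log_prob(weights, source, 4, 5, candidates, "Bank")

# Both candidates get the same context-independent phrase score here,
# so only the classifier feature separates them.
score_ufer = loglinear_score({"phrase_tm": -1.2, "context_clf": lp_ufer}, lambdas)
score_bank = loglinear_score({"phrase_tm": -1.2, "context_clf": lp_bank}, lambdas)
```

In this toy setup the phrase-level feature cannot distinguish the two candidates, while the context feature prefers the one supported by a word outside the translated span.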
Similar resources
Discriminative Reordering Extensions for Hierarchical Phrase-Based Machine Translation
In this paper, we propose novel extensions of hierarchical phrase-based systems with a discriminative lexicalized reordering model. We compare different feature sets for the discriminative reordering model and investigate combinations with three types of non-lexicalized reordering rules which are added to the hierarchical grammar in order to allow for more reordering flexibility during decoding...
Hierarchical Phrase-based Stream Decoding
This paper proposes a method for hierarchical phrase-based stream decoding. A stream decoder is able to take a continuous stream of tokens as input, and segments this stream into word sequences that are translated and output as a stream of target word sequences. Phrase-based stream decoding techniques have been shown to be effective as a means of simultaneous interpretation. In this paper we tr...
Phrasal: A Toolkit for Statistical Machine Translation with Facilities for Extraction and Incorporation of Arbitrary Model Features
We present a new Java-based open source toolkit for phrase-based machine translation. The key innovation provided by the toolkit is to use APIs for integrating new features (/knowledge sources) into the decoding model and for extracting feature statistics from aligned bitexts. The package was used to develop a number of useful features written to these APIs including features for hierarchical r...
Phrasal: A Statistical Machine Translation Toolkit for Exploring New Model Features
We present a new Java-based open source toolkit for phrase-based machine translation. The key innovation provided by the toolkit is to use APIs for integrating new features (/knowledge sources) into the decoding model and for extracting feature statistics from aligned bitexts. The package includes a number of useful features written to these APIs including features for hierarchical reordering, ...
Integrating Case Frame into Japanese to Chinese Hierarchical Phrase-based Translation Model
This paper presents a novel approach to enhance hierarchical phrase-based (HPB) machine translation systems with case frame (CF). We integrate the Japanese shallow CF into both rule extraction and decoding. All of these rules are then employed to decode new sentences in Japanese with source language case frame. The results of experiments carried out on Japanese-Chinese test sets show that o...
Journal:
- Prague Bull. Math. Linguistics
Volume 101
Publication date: 2014